-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Capture when mongoSocket.conn is nil to prevent panics #38
Conversation
👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code in isolation looks good.
My only concern is how would the caller handle this error and what is it supposed to do? Some docs on retrying the error would be great, if thats what we are supposed to do.
+1 but I am not too sure about the internals see my comment here #38 (review) @mikramulhaq can you also please take a look. |
👍 |
I just reviewed the code and don't see any issues with this change. There is a race around socket handling that could've led to this and that Chris's PR will fix. Details below:
If a call is made to Adding a nil check and logging here might let us know if this was the root cause. Line 237 in 5008222
|
Overall this codebase is pretty shaky. Might be worth looking at and potentially pulling in some of the changes in https://github.com/globalsign. I'm not sure if this relates, but globalsign#69 looks interesting. |
Also, Lines 226 to 243 in 5008222
Lines 254 to 265 in 5008222
Basically, the lock would need to be held for the entirety of |
It looks like we introduced the racy code (61b7e27). I think we fix that as well and add logging around the nil checks. Another PR to get the tests working would also be good since they're currently broken. |
Just for cross referencing, I'm fixing this in #40 |
Panic:
Tracing this down assuming sourcegraph is exhaustive:
socket.conn
newSocket
Connect
where the connection is dialedLooking at the dial logic in
Connect
, it chooses between the stdlib, or one a new/old dialer. The only way conn could be nil is if the dialers return (nil, nil), which should only happen on a bad custom dialer implementation. The code that encountered this does not specify a custom dialer and is likely using the stdlib dialer, which should never return (nil, nil). Adding a nil check here just to be safe. As connections should be long-lived, I'd expect this change to be negligible performance-wise.And, treating the symptom, also doing a nil check on
socket.conn
in thekill
method.